Abstract

We briefly summarise the "MSTW 2008" determination of parton distribution functions (PDFs), and subsequent 
follow-up studies, before reviewing some topical issues concerning the PDF dependence of cross sections at the 
Tevatron and LHC. We update a recently published study of benchmark Standard Model total cross sections (W, Z,
$gg \to H$ and $t\bar{t}$ production) at the 7 TeV LHC, where we account for all publicly available PDF sets and we compare
to LHC data for W, Z, and $t\bar{t}$ production. We show the sensitivity of the Higgs cross sections to the gluon distribution,
then we demonstrate the ability of the Tevatron jet data, and also the LHC $t\bar{t}$ data, to discriminate between PDF sets
with different high-$x$ gluon distributions. We discuss the related problem of attempts to extract the strong coupling $\alpha_S$
from only deep-inelastic scattering data, and we conclude that a direct data constraint on the high-$x$ gluon distribution
is required to obtain a meaningful result. We therefore discourage the use of PDF sets obtained from "non-global" fits 
where the high-x gluon distribution is not directly constrained by data. 
1. Introduction

The parton distribution functions (PDFs) of the pro- 
ton are a non-negotiable input to almost all theory pre- 
dictions at hadron colliders. The proton PDFs are de- 
termined by several groups from (global) analysis of a 
wide range of deep-inelastic scattering (DIS) and re- 
lated hard-scattering data. The DIS data from HERA 
are perhaps the single most important input to global 
PDF fits. It is even possible to extract PDFs based only
on HERA DIS data, albeit at the expense of leaving
some kinematic regions and flavour combinations unconstrained,
or alternatively by imposing a severe parameterisation
constraint. The HERA data are therefore
generally supplemented with DIS and Drell-Yan data
from fixed-target experiments.

Of course, we need to know PDFs to predict Tevatron 
and LHC cross sections, but this argument can also work 
the other way around. If a cross section at a hadron collider
is predicted with relatively small theoretical uncertainty,
and it is sensitive to PDFs in a kinematic region
poorly constrained by HERA and fixed-target data, then 
precise measurements of hadron collider cross sections 
can give important information on PDFs. To give an 
example, inclusive jet production at the Tevatron is cur- 
rently essential to directly constrain the high-x gluon. 

The contents of this contribution to the proceedings 
are as follows. First, in Sec. 2 we briefly review the status
of the "MSTW" determination of PDFs. However,
this write-up will mostly be based on two recent papers
[1, 2], with some updates to account for new PDF
sets and LHC data released subsequent to their publication.
To avoid a deluge of plots we will only present
a limited selection in this write-up, and a more extensive
collection can be found at a public webpage [3]. In
Sec. 3 we describe a recent benchmark exercise and give
the status of the most recent PDF sets from all fitting
groups (as of December 2011). In Sec. 4 we discuss W
and Z production at the LHC, and the sensitivity to the
Figure 1: MSTW 2008 NNLO PDFs at two different $Q^2$ values [4].

Figure 2: $\Delta\chi^2_{\rm global}$ as a function of $\alpha_S(M_Z^2)$ [5].

quark distributions. In Sec. 5 we discuss Higgs, top-pair
and jet production, and the sensitivity to the gluon distribution.
In Sec. 6 we discuss the values of $\alpha_S$ obtained
from DIS data. Finally, we summarise in Sec. 7.

2. Status of MSTW PDF analysis 

The "MSTW 2008" determination of PDFs [4] at
leading-order (LO), next-to-leading order (NLO) and
next-to-next-to-leading order (NNLO) superseded the
previously available "MRST" PDFs. New data sets fitted
included neutrino structure functions ($F_2$ and $xF_3$)
from NuTeV and CHORUS, neutrino dimuon cross sections
from CCFR and NuTeV, HERA data on $F_2^{\rm charm}$ and
on inclusive jet production in DIS, and Tevatron Run II
data on inclusive jet production, the lepton charge asymmetry
from W decays and the Z rapidity distribution.
The CCFR/NuTeV dimuon cross sections allowed the



Figure 3: $\Delta\chi^2_{\rm global}$ as a function of (pole-mass) $m_c$ [6].

Table 1: Impact of $\alpha_S$ and $m_{c,b}$ variation on LHC cross sections [6].



strange-quark and -antiquark distributions to be fit directly
for the first time. The Tevatron Run II jet data
were found to prefer a softer gluon distribution at high
$x$ than the previous Run I data used in the MRST 2001-2006
fits. Uncertainties on the PDFs, shown in Fig. 1,
were propagated from the experimental errors on the fitted
data points using a new dynamic procedure for each
eigenvector of the covariance matrix. Subsequent studies
used the same procedure to determine the experimental
error on the best-fit $\alpha_S(M_Z^2)$, shown in Fig. 2 [5],
and on the best-fit (pole-mass) $m_c$, shown in Fig. 3 [6].

Uncertainties in both $\alpha_S(M_Z^2)$ and the heavy-quark
masses $m_{c,b}$ induce an additional uncertainty in cross-section
predictions compared to the uncertainty arising
only from PDFs. This increase in uncertainty is
shown in Table 1 for the W, Z and $gg \to H$ ($M_H = 120$ GeV)
NNLO total cross sections at the LHC with
$\sqrt{s} = 7$ TeV. Here, the PDF uncertainties are at 68%
confidence-level (C.L.), and the parameters are varied
in the ranges $\alpha_S(M_Z^2) = 0.1171 \pm 0.0014$, $m_c = 1.40 \pm 0.15$ GeV
and $m_b = 4.75 \pm 0.25$ GeV. Varying $m_c$ leads to
a change in $\sigma_{W,Z}$ of just over 1%, while varying $m_b$ leads
to a negligible change (0.1%) in $\sigma_{W,Z}$, and $\sigma_H$ is insensitive
to $m_{c,b}$ variation. Adding the $m_{c,b}$ uncertainty in
quadrature to the "PDF+$\alpha_S$" uncertainty does not
significantly enlarge it, so the $m_{c,b}$ uncertainty
will not be included in the remainder
of this write-up. Further preliminary studies [7, 8] investigated
problems in describing the precise $W \to \ell\nu$
charge asymmetry data at the Tevatron, and examined
the impact of including the combined HERA I data
in the MSTW global fit, where the changes were not
large enough to warrant an immediate update [8].

3. Benchmark exercise 

Various fitting groups currently produce PDF
sets: MSTW08 [4], CTEQ6.6/CT10 [10, 11],
NNPDF2.1 [12, 13], HERAPDF1.0/1.5 [9, 14, 15],
ABKM09 [16] and GJR08/JR09 [17, 18, 19, 20]. Past
experience has shown that results obtained with the
different PDF sets often do not agree within the quoted
uncertainties. Quantifying, understanding, and then hopefully
resolving differences in PDFs between groups
is therefore at least as important as
continued improvements in PDFs within groups. Some
recent work has been initiated by the activities of the
LHC Higgs Cross Section Working Group and the
PDF4LHC Working Group. In particular, an exercise
was proposed to use the most recent public NLO
PDFs from all fitting groups to calculate some LHC
benchmark processes at $\sqrt{s} = 7$ TeV, specifically
total cross sections for production of $W^\pm$, $Z^0$, $t\bar{t}$ and
$gg \to H$ for $M_H = \{120, 180, 240\}$ GeV. The aims
were to establish the degree of compatibility and
identify outliers amongst PDF sets, and to compare
cross sections at the same $\alpha_S$ values, thereby showing
to what extent differences in predictions are due to
the different $\alpha_S$ values adopted by each group, rather
than differences in the PDFs themselves. The results
at NLO, initially presented by G.W. in a PDF4LHC
meeting at CERN on 26th March 2010, formed the
basis for the subsequent PDF4LHC Interim Report [21]
and PDF4LHC Interim Recommendations [22] used in
the Handbook of LHC Higgs Cross Sections [23]. An
update of the NLO comparisons to include the recent
CT10 [11] and NNPDF2.1 [12] analyses, and an extension
of all comparisons to NNLO, was subsequently
made in a later publication [1]. In this write-up we
will make a further update to the comparison plots to
include the even more recent NNPDF2.1 NNLO [13]
and HERAPDF1.5 NNLO [15] analyses, and we will
compare the predictions to the latest LHC data on the
total cross sections for W, Z and $t\bar{t}$ production.

We will consider only public sets, defined to be those
available for use with the latest LHAPDF V5.8.6 [24]. The
broad distinctions between data sets fitted and aspects
of the theoretical treatment are summarised in Table 2.
Only three groups (MSTW, CT and NNPDF) make fully
global fits to HERA and fixed-target DIS data, fixed-target
Drell-Yan production, and Tevatron data on W,
Z and jet production, although GJR08 includes all these
processes other than Tevatron W and Z production. The
HERAPDF1.0 fit includes only the combined HERA I
inclusive data [9], while the HERAPDF1.5 fit additionally
includes the preliminary combined HERA II inclusive
data. The CT10 and NNPDF2.1 global fits include
the combined HERA I inclusive data, while the other
fits (MSTW08, CTEQ6.6, ABKM09, GJR08/JR09) include
the older separate data from H1 and ZEUS. The
MSTW08, CT10, NNPDF2.1 and GJR08 fits include
Tevatron Run II data, while CTEQ6.6 uses only Tevatron
Run I data. The original NNPDF2.1 fit has been
reweighted to include Tevatron and LHC data on the
$W \to \ell\nu$ charge asymmetry, denoted NNPDF2.2 [25],
but this reweighted PDF set will not be considered here.

Most groups now treat the heavy-quark contribution
to DIS structure functions using a general-mass variable
flavour number scheme (GM-VFNS), other than
ABKM09 and GJR08/JR09, which use a fixed flavour
number scheme (FFNS). The change from the inadequate
zero-mass variable flavour number scheme (ZM-VFNS)
to the GM-VFNS was the major improvement
between NNPDF2.0 [26] and NNPDF2.1 [12], now allowing
a meaningful comparison to other NLO global
fits. The NNPDF fits parameterise the starting distributions
at $Q_0^2 = 2$ GeV$^2$ as neural networks and use Monte
Carlo methods for experimental error propagation. The
other groups all use the more traditional approach of
parameterising the input PDFs as some functional form
in $x$, each with a handful of free parameters, and use
the Hessian method for experimental error propagation
with differing values of the tolerance $\Delta\chi^2$, that is, the
change in the goodness-of-fit measure relative to the
best-fit value. Contrary to the "standard" input parameterisation
at $Q_0^2 \ge 1$ GeV$^2$, the GJR08/JR09 sets use a
"dynamical" input parameterisation of valence-like input
distributions at an optimally chosen $Q_0^2 < 1$ GeV$^2$,
which gives a slightly worse fit quality and lower $\alpha_S$
values than the corresponding "standard" parameterisation,
but is nevertheless favoured by the GJR08/JR09
authors.

Public NNLO fits are available from MSTW08,
NNPDF2.1, HERAPDF1.0/1.5, ABKM09 and JR09.
(The first NNLO fits from the CT group should be available
soon.) The Tevatron jet cross sections are excluded
from the JR09 fit, where complete NNLO corrections
are unavailable, whereas they are included in
the MSTW08 and NNPDF2.1 NNLO fits by making the
approximation of using the NLO partonic cross section
supplemented by 2-loop threshold corrections [27].
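
The two approaches to experimental error propagation mentioned above can be illustrated with a minimal Python sketch. It assumes the commonly used symmetric Hessian formula, $\Delta F = \frac{1}{2}\sqrt{\sum_k (F_k^+ - F_k^-)^2}$, and the replica mean and standard deviation for the Monte Carlo case. The function names and the numbers in the usage example are hypothetical, and individual fitting groups use their own variants (for example asymmetric Hessian uncertainties), so this is not any group's actual prescription.

```python
import math
from statistics import mean, stdev


def hessian_symmetric_uncertainty(plus_values, minus_values):
    """Symmetric Hessian uncertainty on an observable F.

    plus_values[k] and minus_values[k] are F evaluated with the PDF
    eigenvector sets displaced in the + and - direction of eigenvector k.
    (Assumed formula: 0.5 * sqrt(sum_k (F_k^+ - F_k^-)^2).)
    """
    if len(plus_values) != len(minus_values):
        raise ValueError("need one + and one - value per eigenvector")
    squared = sum((p - m) ** 2 for p, m in zip(plus_values, minus_values))
    return 0.5 * math.sqrt(squared)


def monte_carlo_estimate(replica_values):
    """Central value and uncertainty from Monte Carlo replicas.

    The central value is the replica mean and the 1-sigma uncertainty
    is the sample standard deviation over replicas.
    """
    return mean(replica_values), stdev(replica_values)


if __name__ == "__main__":
    # Hypothetical cross-section values (arbitrary units), for illustration only.
    print(hessian_symmetric_uncertainty([1.03, 1.00], [0.97, 0.92]))
    print(monte_carlo_estimate([1.0, 2.0, 3.0]))
```

The Hessian function needs one pair of displaced PDF sets per eigenvector, whereas the Monte Carlo estimate only needs the observable evaluated on each replica.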
In Fig. 4 we show the default values of $\alpha_S(M_Z^2)$ used

Table 2: Comparison of major PDF sets considered, and their gross features distinguished by the main classes of data included (upper part of
table) and important aspects of the theoretical treatment (lower part of table), specifically regarding the treatment of heavy quarks in DIS and the 
provision of NNLO PDFs. More refined differences between PDF sets are described in the text. 



[Figure 4 panels: "NLO $\alpha_S(M_Z^2)$ values used by different PDF groups" (68% C.L. PDF sets: MSTW08, CTEQ6.6, CT10, NNPDF2.1, HERAPDF1.0, HERAPDF1.5, ABKM09, GJR08) and "NNLO $\alpha_S(M_Z^2)$ values used by different PDF groups" (68% C.L. PDF sets: MSTW08, NNPDF2.1, HERAPDF1.0, HERAPDF1.5, ABKM09, JR09); horizontal axis: $\alpha_S(M_Z^2)$.]

Figure 4: Values of $\alpha_S(M_Z^2)$, and their 1-$\sigma$ uncertainties, used by different PDF fitting groups at NLO and NNLO. The smaller symbols indicate the
PDF sets with alternative values of $\alpha_S$ provided by each fitting group. The shaded band indicates the Bethke 2009 world average $\alpha_S(M_Z^2)$.



by different fitting groups at NLO and NNLO, and
we compare to the world average value obtained by
S. Bethke in 2009 [28] (and taken over by the Particle
Data Group in 2010 [29]). The values for MSTW08,
ABKM09 and GJR08/JR09 are obtained from a simultaneous
fit with the PDF parameters, while $\alpha_S(M_Z^2)$ for
other groups is applied as an external constraint, generally
chosen to be close to the world average [28],
and for those groups we assume a 1-$\sigma$ uncertainty of
$\pm 0.0012$. The smaller symbols indicate the PDF
sets with alternative values of $\alpha_S(M_Z^2)$ provided
by each fitting group. The fitted NLO $\alpha_S(M_Z^2)$ values
are always larger than the fitted NNLO $\alpha_S(M_Z^2)$
values, since the fit attempts to mimic the missing
higher-order corrections, which are generally positive.
This trend is repeated in the recent NNPDF determinations
of $\alpha_S(M_Z^2) = 0.1191 \pm 0.0006$ at NLO [30] and
$\alpha_S(M_Z^2) = 0.1173 \pm 0.0007$ at NNLO [31], where the
experimental uncertainties are obtained using $\Delta\chi^2 = 1$,
but these are not used as the default $\alpha_S(M_Z^2)$ values for
the provided NNPDF2.1 PDFs.



[Figure 5 panel title: "$\Sigma_q(q\bar{q})$ luminosity at LHC ($\sqrt{s}$ = 7 TeV)"]

Figure 5: NNLO $q\bar{q}$ luminosities as the ratio to MSTW 2008.

4. W and Z production at the LHC 

To understand properties of hadronic cross sections, 
such as PDF uncertainties or the dependence on col- 
lider energy, it is useful to consider the relevant parton- 
Figure 6: NNLO $W^\pm$ ($= W^+ + W^-$) and $Z^0$ total cross sections at the LHC, plotted as a function of $\alpha_S(M_Z^2)$.



parton luminosities. We define a $q\bar{q}$ luminosity, relevant
for the (rapidity-integrated) total cross sections for W
and Z production at the LHC, as

$$
\frac{\partial \mathcal{L}_{\Sigma_q(q\bar{q})}}{\partial \hat{s}} = \frac{1}{s} \int_\tau^1 \frac{dx}{x} \sum_{q=d,u,s,c,b} \left[ q(x,\hat{s})\,\bar{q}(\tau/x,\hat{s}) + (q \leftrightarrow \bar{q}) \right],
$$

where $\tau = \hat{s}/s$ and $\sqrt{\hat{s}}$ is the partonic centre-of-mass
energy. Note that this generic $q\bar{q}$ luminosity does not
specifically include the correct flavour combinations for
$W^\pm$ or $Z^0$ production. More detailed studies would,
for example, include the correct couplings of the vector
bosons to quarks and antiquarks, or consider the specific
$u\bar{d}$ (for $W^+$) or $d\bar{u}$ (for $W^-$) partonic luminosities.
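
As an illustration of the luminosity definition above, the following Python sketch evaluates the integral numerically with a trapezoidal rule in $\ln x$. The function name, the toy distributions and the boson masses in the usage example are assumptions made purely for illustration; they are not any group's fitted PDFs, which in practice would be taken from a PDF library.

```python
import math


def qqbar_luminosity(flavours, tau, s, n_steps=2000):
    """Return (1/s) * int_tau^1 dx/x * sum_q [q(x) qbar(tau/x) + qbar(x) q(tau/x)].

    `flavours` is a list of (q, qbar) callables of x, assumed to be already
    evaluated at the scale mu^2 = s_hat = tau * s.
    """
    log_tau = math.log(tau)
    step = -log_tau / n_steps          # uniform step in ln(x), so dx/x = d(ln x)
    total = 0.0
    for i in range(n_steps + 1):
        # Clamp to 1 to protect against floating-point overshoot at the end points.
        x = min(math.exp(log_tau + i * step), 1.0)
        y = min(tau / x, 1.0)
        integrand = sum(q(x) * qbar(y) + qbar(x) * q(y) for q, qbar in flavours)
        weight = 0.5 if i in (0, n_steps) else 1.0   # trapezoidal end-point weights
        total += weight * integrand
    return total * step / s


def toy_quark(x):
    """Toy quark density (hypothetical shape, not a fitted PDF)."""
    return x ** -1.2 * (1.0 - x) ** 3


def toy_antiquark(x):
    """Toy antiquark density (hypothetical shape, not a fitted PDF)."""
    return 0.5 * x ** -1.2 * (1.0 - x) ** 7


if __name__ == "__main__":
    s = 7000.0 ** 2                    # (7 TeV)^2 in GeV^2
    for mass in (80.4, 91.2):          # approximate W and Z masses in GeV (assumed)
        tau = mass ** 2 / s
        print(mass, qqbar_luminosity([(toy_quark, toy_antiquark)], tau, s))
```

Integrating in $\ln x$ keeps the sampling uniform across the many decades in $x$ between $\tau$ and 1, which is where a uniform grid in $x$ would lose accuracy.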

In Fig. 5 we show the NNLO $q\bar{q}$ luminosities as the
ratio with respect to the MSTW 2008 NNLO luminosities,
for the LHC at $\sqrt{s} = 7$ TeV. We use the default
$\alpha_S$ values for each set, shown in Fig. 4. The HERAPDF1.0
NNLO curve (without uncertainties) is for
$\alpha_S(M_Z^2) = 0.1176$. The inner uncertainty bands (dashed
lines) for HERAPDF1.5 correspond to the (asymmetric)
experimental errors, while the outer uncertainty bands
(shaded regions) also include the model and parameterisation
errors, including uncertainties on heavy-quark
masses but not on $\alpha_S$. It is not possible to separate the
"PDF only" uncertainty for ABKM09 and JR09, therefore
the uncertainty bands for those sets also include the
$\alpha_S$ uncertainty, and the uncertainty bands for ABKM09
also include uncertainties on heavy-quark masses. This
is undesirable but unavoidable given that these groups
do not provide PDF sets for fixed $\alpha_S$ (and fixed $m_{c,b}$ for
ABKM09). The relevant values of $\sqrt{\hat{s}} = M_{W,Z}$ are indicated,
and there is good agreement for the two global
fits (MSTW08 and NNPDF2.1), but more variation for
the other sets. The NNLO trend between groups is similar
to that at NLO, with the exception of HERAPDF
at large $\hat{s}$ values, where the HERAPDF NLO sets have a
much larger $q\bar{q}$ luminosity than other NLO PDF groups.

In Fig. 6 we show the $W^\pm$ ($= W^+ + W^-$) and $Z^0$ total
cross sections multiplied by the appropriate leptonic
branching ratios, $B(W^\pm \to \ell^\pm\nu)$ or $B(Z^0 \to \ell^+\ell^-)$, calculated
at NNLO [32] with a scale choice $\mu_R = \mu_F = M_{W,Z}$,
plotted as a function of $\alpha_S(M_Z^2)$. The markers are
centred on the default $\alpha_S(M_Z^2)$ value and the corresponding
predicted cross section of each PDF fitting group.
The horizontal error bars span the $\alpha_S(M_Z^2)$ uncertainty,
the inner vertical error bars span the "PDF only" uncertainty
where possible (i.e. not for ABKM09 or JR09),
and the outer vertical error bars span the "PDF+$\alpha_S$" uncertainty.
The effect of the additional $\alpha_S$ uncertainty is
small for W and Z production. The dashed lines interpolate
the cross-section predictions calculated with the
alternative PDF sets with different $\alpha_S(M_Z^2)$ values provided
by each group, represented by the smaller symbols
in Fig. 4. The two global fits (MSTW08 and
NNPDF2.1) are in good agreement, as was apparent
from the $q\bar{q}$ luminosity plot in Fig. 5. The central HERAPDF1.5
prediction is close to the global fits at NNLO,
contrary to the predictions using HERAPDF1.0 NNLO
or HERAPDF1.0/1.5 NLO [1].

The total cross-section ratios, $W^\pm/Z^0$ and $W^+/W^-$,
are insensitive to NNLO corrections and the value of
$\alpha_S(M_Z^2)$. The $W^\pm$ ($= W^+ + W^-$) and $Z^0$ total cross sections
are highly correlated, which can be understood by
considering the dominant partonic contributions arising
from $u$ and $d$ quarks, i.e.

$$
\frac{\sigma_{W^+} + \sigma_{W^-}}{\sigma_{Z^0}} \sim \frac{u(x_1) + d(x_1)}{0.29\,u(\tilde{x}_1) + 0.37\,d(\tilde{x}_1)}, \tag{1}
$$
[Figure 7 panel title: "NNLO W and Z cross sections at the LHC ($\sqrt{s}$ = 7 TeV)"]

Figure 7: $W^\pm$ vs. $Z^0$ and $W^+$ vs. $W^-$ total cross sections at NNLO, compared to data from CMS [33] and ATLAS [34].



where we have neglected the contributions with $q \leftrightarrow \bar{q}$,
assuming that $q(x_1)\bar{q}(x_2)$ dominates over $\bar{q}(x_1)q(x_2)$,
and the numerical values in the denominator are the
appropriate sums of the squares of the vector and
axial-vector couplings. We have also assumed that
$\bar{u}(x_2) \sim \bar{d}(x_2)$. Here, the momentum fractions are
$x_{1,2} = (M_W/\sqrt{s})\exp(\pm y_W)$ and $\tilde{x}_{1,2} = (M_Z/\sqrt{s})\exp(\pm y_Z)$, and
$y_W$ or $y_Z$ should be interpreted as some "average" rapidity
appropriate for the (rapidity-integrated) total cross
section. The combination of $u$- and $d$-quark contributions
is very similar (in numerical prefactors, in $x$ values,
and in $Q^2 = M_{W,Z}^2$ values) in both the numerator
and denominator of Eq. (1), therefore the PDF dependence
almost cancels in the $W^\pm/Z^0$ ratio. The $W^+$ and
$W^-$ total cross sections are much less correlated, since

$$
\frac{\sigma_{W^+}}{\sigma_{W^-}} \sim \frac{u(x_1)\,\bar{d}(x_2)}{d(x_1)\,\bar{u}(x_2)} \sim \frac{u(x_1)}{d(x_1)},
$$

and therefore the $W^+/W^-$ cross-section ratio is a sensitive
probe of the $u/d$ ratio and there is more variation
between different PDF sets.
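
The numerical prefactors 0.29 and 0.37 quoted in Eq. (1) can be checked with a short Python sketch. It assumes the convention $v_q = T_3^q - 2 Q_q \sin^2\theta_W$ and $a_q = T_3^q$ for the Z couplings, and an assumed value $\sin^2\theta_W = 0.2312$; both the convention and the input value are assumptions made here for illustration, and the function name is hypothetical.

```python
SIN2_THETA_W = 0.2312  # assumed value of the weak mixing angle


def z_coupling_sum(t3, charge, sin2_theta_w=SIN2_THETA_W):
    """Return v^2 + a^2 for a quark coupling to the Z boson.

    Assumed convention: v = T3 - 2 Q sin^2(theta_W), a = T3.
    """
    vector = t3 - 2.0 * charge * sin2_theta_w
    axial = t3
    return vector ** 2 + axial ** 2


if __name__ == "__main__":
    up = z_coupling_sum(t3=+0.5, charge=+2.0 / 3.0)
    down = z_coupling_sum(t3=-0.5, charge=-1.0 / 3.0)
    print(f"u quark: v^2 + a^2 = {up:.2f}")
    print(f"d quark: v^2 + a^2 = {down:.2f}")
```

Because the two prefactors are of similar size, the denominator of Eq. (1) weights $u$ and $d$ quarks almost equally, just as the numerator does.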

In Fig. 7 we show $W^\pm$ ($= W^+ + W^-$) versus $Z^0$ and $W^+$
versus $W^-$ total cross sections, and we draw dotted lines
where the ratio of cross sections, $W^\pm/Z^0$ or $W^+/W^-$, is
constant. We also compare to the experimental measurements
using the 2010 LHC data from CMS [33]
and ATLAS [34]. Here, the measured $Z^0$ cross sections
have been corrected [1] for the small $\gamma^*$ contribution and
the finite invariant-mass range of the lepton pair (different
for ATLAS and CMS) using a theory calculation at
NNLO [35]. The spread in predictions using the different
PDF sets is comparable to the (dominant) luminosity
uncertainty of 4% (CMS) or 3.4% (ATLAS), and
perhaps the two extreme predictions (HERAPDF1.0 and
JR09) are somewhat disfavoured. In Fig. 8 we show the
same plots with ellipses drawn to account for the correlations
between the two cross sections, both for the
experimental measurements and for the theoretical predictions.
Here, the ellipses are defined such that the projection
onto either axis gives the 1-$\sigma$ uncertainty for the
individual cross sections, meaning that the area of the
two-dimensional ellipse corresponds to a confidence-level
somewhat smaller than the conventional 68%. The
largest uncertainty in the ATLAS and CMS total cross-section
ratios comes from extrapolating the measured
(fiducial) cross section over the whole phase space. Indeed,
improvements in the acceptance calculation led
to the central value of the ATLAS $W^+/W^-$ total cross-section
ratio shifting from the preliminary result of 1.51
in March 2011 [36] to 1.45 in the final publication [34],
following observations made in Ref. [1]. Therefore,
data-to-theory comparisons are best made at the level
of the fiducial cross section (i.e. within the acceptance),
which is now possible using the public FEWZ [37, 38]
and DYNNLO [39] codes, and indeed was done in the ATLAS
publication [34].
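
The statement about the confidence level of such ellipses can be made quantitative under the assumption of Gaussian uncertainties, as in the Python sketch below. An ellipse whose projections onto each axis give the 1-$\sigma$ ranges corresponds to $\Delta\chi^2 = 1$, and for two correlated Gaussian variables the probability content of a $\Delta\chi^2$ contour is $1 - \exp(-\Delta\chi^2/2)$. The function names are hypothetical and the Gaussian assumption is ours, not a statement about the experimental analyses.

```python
import math


def coverage_1d(delta_chi2):
    """Probability content of |x| < sqrt(delta_chi2) sigma for one Gaussian variable."""
    return math.erf(math.sqrt(delta_chi2 / 2.0))


def coverage_2d(delta_chi2):
    """Probability content of a delta_chi2 ellipse for two Gaussian variables."""
    return 1.0 - math.exp(-delta_chi2 / 2.0)


def delta_chi2_for_coverage_2d(probability):
    """Delta chi^2 needed for a 2D ellipse to enclose the given probability."""
    return -2.0 * math.log(1.0 - probability)


if __name__ == "__main__":
    print(f"1D, delta chi2 = 1: {coverage_1d(1.0):.3f}")
    print(f"2D, delta chi2 = 1: {coverage_2d(1.0):.3f}")
    print(f"2D delta chi2 for 68.27%: {delta_chi2_for_coverage_2d(0.6827):.2f}")
```

Under these assumptions, an ellipse drawn so that its projections reproduce the individual 1-$\sigma$ errors encloses less probability than the one-dimensional 68%, and a larger $\Delta\chi^2$ would be needed for a genuine 68% two-dimensional region.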

[Figure 8 panel titles: "NNLO W and Z cross sections at the LHC ($\sqrt{s}$ = 7 TeV)" and "NNLO $W^+$ and $W^-$ cross sections at the LHC ($\sqrt{s}$ = 7 TeV)"]

Figure 8: Same as Fig. 7, but with ellipses accounting for the correlations between the two cross sections.

Of course, as a constraint on PDFs, differential distributions
are more useful than total or fiducial cross
sections. The $W^\pm$ charge asymmetry is sensitive to the
difference between up- and down-valence quark distributions, i.e.

$$
A_W(y_W) = \frac{d\sigma(W^+)/dy_W - d\sigma(W^-)/dy_W}{d\sigma(W^+)/dy_W + d\sigma(W^-)/dy_W} \approx \frac{u_v(x_1) - d_v(x_1)}{u(x_1) + d(x_1)}.
$$

Experimentally, it is not possible to directly reconstruct
the W rapidity since the longitudinal momentum of the
decay neutrino is unknown, so instead the $W^\pm \to \ell^\pm\nu$
charge asymmetry is measured as a function of the
charged-lepton pseudorapidity $\eta_\ell$, i.e.

$$
A_\ell(\eta_\ell) = \frac{d\sigma(\ell^+)/d\eta_\ell - d\sigma(\ell^-)/d\eta_\ell}{d\sigma(\ell^+)/d\eta_\ell + d\sigma(\ell^-)/d\eta_\ell} = A_W(y_W) \otimes (W^\pm \to \ell^\pm\nu),
$$

where the last equality is symbolic and indicates the underlying
$W^\pm$ charge asymmetry convoluted with the $W^\pm \to \ell^\pm\nu$
decay. Measurements of $A_\ell(\eta_\ell)$ have provided the
first useful PDF constraint from LHC data, and indicate
that $A_\ell(0)$ is underestimated by the MSTW08 fit
and is better described by some other PDF sets, implying
that $u_v - d_v$ is too small at $x \sim M_W/\sqrt{s} \sim 0.01$.
This behaviour may be traced to the independent small-$x$
powers, $xu_v \propto x^{0.29 \pm 0.02}$ and $xd_v \propto x^{0.97 \pm 0.11}$, at the
input scale $Q_0^2 = 1$ GeV$^2$ for the MSTW08 NLO fit,
compared to some other groups (e.g. CTEQ6.6/CT10)
where the small-$x$ powers of $u_v$ and $d_v$ are assumed to
be equal. On the other hand, this implies some tension
between the LHC $W \to \ell\nu$ charge asymmetry data and
the data already included in the MSTW08 fit (e.g. the
Tevatron $W \to \ell\nu$ asymmetry, the NMC $F_2^n/F_2^p$ ratio,
and the E866/NuSea Drell-Yan $\sigma^{pd}/\sigma^{pp}$ ratio). Other
tensions have been observed with the precise Tevatron
data on the $W \to \ell\nu$ charge asymmetry, and partially resolved
by more flexible nuclear corrections for deuteron
structure functions [8]. Further attempts to resolve these
tensions will be necessary for any future update of the
MSTW08 fit. The ATLAS Collaboration provide differential
cross sections for $W^+ \to \ell^+\nu$ and $W^- \to \ell^-\nu$ with
the complete information on correlated systematic uncertainties,
which is potentially more useful for PDF fits
than simply $A_\ell(\eta_\ell)$, and the individual $W^+ \to \ell^+\nu$ and
$W^- \to \ell^-\nu$ cross sections seem to be better described by
the MSTW08 fit than many other PDF sets.

Currently, perhaps the least well-known parton distributions
are the strange-quark and -antiquark distributions,
where the only experimental information comes
from the CCFR/NuTeV dimuon cross sections in the
region $0.01 < x < 0.2$. Although these data are included
in the MSTW08, CT10 and NNPDF2.1 analyses,
the three groups obtain quite different results for the
$s$ and $\bar{s}$ distributions in the data region. These differences
are not well understood, but may be due to issues
such as the acceptance calculation, different treatment
of nuclear corrections, or differences in the treatment
of charm production in charged-current DIS. In particular,
the NNPDF2.1 fit includes the contribution initiated
by the charm-quark PDF, leading to a smaller strange-quark
PDF compared to the MSTW08 and CT10 fits,
which do not include this contribution. In the $x$ region
outside of the CCFR/NuTeV data, the $s$ and $\bar{s}$ distributions
are largely determined by the assumed parameterisation
(with the possible exception of NNPDF). A complementary,
and perhaps theoretically cleaner, process
to probe strangeness at the LHC is W production with
an associated charm-tagged jet. The dominant partonic
subprocesses are $\bar{s}g \to W^+\bar{c}$ and $sg \to W^-c$, with
a small Cabibbo-suppressed contribution from $\bar{d}g \to W^+\bar{c}$ (5%)
and $dg \to W^-c$ (15%). A first preliminary
measurement has been made by CMS [40] of the
cross-section ratios $R_c = \sigma(W + c)/\sigma(W + {\rm jets})$, probing
the strange content of the proton relative to other
light-quark flavours, and $R_c^\pm = \sigma(W^+ + \bar{c})/\sigma(W^- + c)$,
potentially probing the strange asymmetry. The current
CMS data [40] do not strongly discriminate between
current PDF sets, although the measurement of $R_c$ is in
slightly better agreement with MSTW08 and CT10 and
is more than 1-$\sigma$ above the NNPDF2.1 prediction. With
more precise measurements to come, including differential
distributions, the W+charm process should enable
powerful constraints to be made on the $s$ and $\bar{s}$ distributions.

5. Higgs, top-pair and jet production 

Whereas the cross sections for production of W and Z
bosons are sensitive to the quark distributions, we now
turn to processes that are sensitive to the gluon distribution.
Of course, one of the main goals of the initial
LHC physics programme is to either discover or exclude
the Standard Model (SM) Higgs boson ($H$). This
requires precise knowledge of the theoretical cross section,
where the dominant production mechanism at both
the Tevatron and LHC is gluon-gluon fusion ($gg \to H$)
through a top-quark loop, with the gluon distribution being
sampled at $x \sim M_H/\sqrt{s}$. The exclusion limits at
95% C.L. from December 2011 [41, 42, 43] are given
in Table 3 together with the relevant $x$ values probed.

We will calculate the total cross section ($\sigma_H$) for SM
Higgs boson production (without decay) from gluon-gluon
fusion via a top-quark loop, with a fixed scale




            $M_H$ (GeV)      $x \sim M_H/\sqrt{s}$
Tevatron    156-177          0.08-0.09
ATLAS       112.7-115.5      0.02-0.02
            131-237          0.02-0.03
            251-453          0.04-0.06
CMS         127-600          0.02-0.09

Table 3: Exclusion limits at 95% C.L. for the SM Higgs boson (as of
December 2011) [41, 42, 43] and the approximate $x$ values probed.
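
The approximate $x$ values in Table 3 follow directly from $x \sim M_H/\sqrt{s}$, as the short Python sketch below illustrates. The mass ranges are those listed in the table, and the collider energies are $\sqrt{s} = 1.96$ TeV (Tevatron) and 7 TeV (LHC) as used elsewhere in this write-up; the function and variable names are hypothetical.

```python
SQRT_S_GEV = {"Tevatron": 1960.0, "LHC": 7000.0}

# (experiment, collider, lower mass edge, upper mass edge) in GeV, from Table 3.
EXCLUDED_RANGES = [
    ("Tevatron", "Tevatron", 156.0, 177.0),
    ("ATLAS", "LHC", 112.7, 115.5),
    ("ATLAS", "LHC", 131.0, 237.0),
    ("ATLAS", "LHC", 251.0, 453.0),
    ("CMS", "LHC", 127.0, 600.0),
]


def typical_x(mass_gev, sqrt_s_gev):
    """Typical momentum fraction x ~ M / sqrt(s) probed at central rapidity."""
    return mass_gev / sqrt_s_gev


if __name__ == "__main__":
    for experiment, collider, m_low, m_high in EXCLUDED_RANGES:
        sqrt_s = SQRT_S_GEV[collider]
        x_low = typical_x(m_low, sqrt_s)
        x_high = typical_x(m_high, sqrt_s)
        print(f"{experiment:8s} {m_low:6.1f}-{m_high:6.1f} GeV  x ~ {x_low:.2f}-{x_high:.2f}")
```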



choice of $\mu_R = \mu_F = M_H$. The $m_t$ dependence is retained
only at LO, with $m_t = 171.3$ GeV (PDG 2009
best value), and the higher-order corrections are calculated
in the limit of an infinite top-quark mass, with
NNLO corrections from Ref. [44]. We do not include
the small bottom-quark loop contributions to the $gg \to H$
cross section. The size of the higher-order corrections
to the $gg \to H$ total cross sections is substantial. Taking
the appropriate MSTW08 PDFs and $\alpha_S$ values consistently
at each perturbative order for $\sigma_H$ with $M_H = 160$ GeV,
the NLO/LO ratio is 2.1 (Tevatron) or
1.9 (LHC), the NNLO/LO ratio is 2.7 (Tevatron) or 2.4
(LHC), and so the NNLO/NLO ratio is 1.3 (Tevatron
and LHC). The perturbative series is therefore slowly
convergent, mandating the use of (at least) NNLO calculations
together with the corresponding NNLO PDFs
and $\alpha_S$ values. The convergence can be improved by
using a scale choice $\mu_R = \mu_F = M_H/2$, which mimics
the effect of soft-gluon resummation. However, we
aim to study only the PDF and $\alpha_S$ dependence of the
$gg \to H$ cross sections, and we do not aim to come up
with a single "best" prediction together with a complete
evaluation of all sources of theoretical uncertainty.
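
The NNLO/NLO ratio quoted above is simply the quotient of the two K-factors relative to LO, as the following Python sketch checks. It uses only the rounded ratios given in the text, so the result is itself only meaningful to about one decimal place; the dictionary and function names are hypothetical.

```python
# K-factors relative to LO for sigma_H with M_H = 160 GeV, as quoted in the text.
K_FACTORS = {
    "Tevatron": {"NLO/LO": 2.1, "NNLO/LO": 2.7},
    "LHC": {"NLO/LO": 1.9, "NNLO/LO": 2.4},
}


def nnlo_over_nlo(k_factors):
    """Return the NNLO/NLO ratio implied by the NNLO/LO and NLO/LO ratios."""
    return k_factors["NNLO/LO"] / k_factors["NLO/LO"]


if __name__ == "__main__":
    for collider, k_factors in K_FACTORS.items():
        print(f"{collider}: NNLO/NLO = {nnlo_over_nlo(k_factors):.1f}")
```

A NNLO correction of about 30% on top of an NLO correction of about 100% is what is meant above by a slowly convergent perturbative series.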

The ratios of the NNLO $gg \to H$ cross sections with
respect to the MSTW08 predictions, plotted against the
SM Higgs mass $M_H$, are shown for the Tevatron and
LHC in Fig. 9, where PDF+$\alpha_S$ uncertainty bands at
68% C.L. are plotted. It can be seen that there is
good agreement for the two global fits (MSTW08 and
NNPDF2.1) at NNLO. The central value of HERAPDF1.5
is also in good agreement, but it has a very large
uncertainty in the upwards direction, and we will return
to this feature later. The HERAPDF1.0 prediction
with $\alpha_S(M_Z^2) = 0.1176$ lies somewhat below MSTW08,
NNPDF2.1 and HERAPDF1.5. However, the ABKM09
prediction lies even further below MSTW08 at the LHC,
and especially at the Tevatron, even allowing for the
PDF+$\alpha_S$ uncertainties, by an amount much larger than
the scale uncertainty (~10%). The Tevatron and LHC
exclusion bounds are based on predictions using the
MSTW08 PDFs, and the decision not to use all available
NNLO PDFs has drawn some misguided criticism
from certain quarters [45, 46], particularly before the
LHC exclusion results became available, due to the more
significant discrepancies between PDF sets for Tevatron
predictions.

The $gg \to H$ cross sections at the Tevatron and LHC
start at $O(\alpha_S^2)$ at LO, with anomalously large higher-order
corrections, therefore they are directly sensitive
to the value of $\alpha_S(M_Z^2)$. Moreover, there is a known
correlation between the value of $\alpha_S$ and the gluon distribution,
which additionally affects the $gg \to H$ cross
[Figure 9 panel titles: "NNLO $gg \to H$ at the Tevatron ($\sqrt{s}$ = 1.96 TeV)" and "NNLO $gg \to H$ at the LHC ($\sqrt{s}$ = 7 TeV)"]

Figure 9: Ratio to MSTW 2008 NNLO $gg \to H$ total cross section, plotted as a function of $M_H$, with PDF+$\alpha_S$ uncertainty bands at 68% C.L.

[Figure 10 panel title: "NNLO $gg \to H$ at the Tevatron ($\sqrt{s}$ = 1.96 TeV) for $M_H$ = 180 GeV"]

Figure 10: NNLO $gg \to H$ total cross sections, plotted as a function of $\alpha_S(M_Z^2)$, for $M_H = 180$ GeV.

Figure 11: NNLO gluon-gluon luminosities as the ratio with respect to MSTW 2008, plotted as a function of $\sqrt{\hat{s}/s}$.

[Figure 12 panel title: "NLO $t\bar{t}$ cross sections at the LHC ($\sqrt{s}$ = 7 TeV)"]

Figure 12: NLO and NNLO (approx.) $t\bar{t}$ total cross sections at the LHC, plotted as a function of $\alpha_S(M_Z^2)$, for $m_t = 171.3$ GeV, and compared to the
single most precise current LHC measurements from CMS [47] and ATLAS [48].



sections. In Fig. 10 we show this sensitivity by plotting the Higgs cross sections versus $\alpha_S(M_Z^2)$ at the Tevatron and LHC for a Higgs mass $M_H = 180$ GeV. The format of the plots is the same as in Fig. 6, but in this case the effect of the additional $\alpha_S$ uncertainty is more sizeable. It is apparent from the plots that at least part of the MSTW08/ABKM09 discrepancy for Higgs cross sections is due to using quite different values of $\alpha_S(M_Z^2)$ at NNLO, specifically $\alpha_S(M_Z^2) = 0.1135 \pm 0.0014$ for ABKM09 [16] compared to $\alpha_S(M_Z^2) = 0.1171 \pm 0.0014$ for MSTW08 [4, 5]. Comparing cross-section predictions at the same value of $\alpha_S(M_Z^2)$ would reduce the MSTW08/ABKM09 discrepancy at the LHC, but there would still be a significant discrepancy at the Tevatron (see also the later Table 7).

At LO, the PDF dependence of the $gg \to H$ total cross section is simply given by the gluon-gluon luminosity evaluated at a partonic centre-of-mass energy $\sqrt{\hat{s}} = M_H$, i.e.

$$\frac{\partial \mathcal{L}_{gg}}{\partial \hat{s}} = \frac{1}{s} \int_\tau^1 \frac{dx}{x}\, g(x, \hat{s})\, g(\tau/x, \hat{s}), \qquad (1)$$

where $g(x, \mu^2 = \hat{s})$ is the gluon distribution and $\tau \equiv \hat{s}/s$.

In Fig. 11 we show the gluon-gluon luminosities calculated using different PDF sets and taken as the ratio with respect to the MSTW 2008 NNLO value, at centre-of-mass energies corresponding to the Tevatron and LHC. The relevant values of $\sqrt{\hat{s}} = M_H = \{120, 180, 240\}$ GeV are indicated, along with the threshold for $t\bar{t}$ production at the LHC, $\sqrt{\hat{s}} = 2m_t$ with $m_t = 171.3$ GeV, where this process is predominantly $gg$-initiated at the LHC. Indeed, $t\bar{t}$ production at the LHC is strongly correlated with $gg \to H$ production at the Tevatron, with both processes probing the gluon distribution at similar $x$ values, as seen from Fig. 11. There is reasonable agreement for the global fits (MSTW08 and NNPDF2.1), but more variation for the other sets, particularly at large $\hat{s}$, where the HERAPDF1.0 set with $\alpha_S(M_Z^2) = 0.1176$, and especially the ABKM09 set, have a much softer high-$x$ gluon distribution, and this feature has a direct impact on the $gg \to H$ cross sections, particularly at the Tevatron (see Fig. 9). Again, we note that the central value of HERAPDF1.5 is in good agreement with the global fits, but it has a very large uncertainty in the upwards direction, and we will return to this feature later.
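As a rough illustration of the gluon-gluon luminosity integral of Eq. (1), the sketch below evaluates it numerically for a toy gluon shape and also prints the typical $x \sim \sqrt{\hat{s}}/\sqrt{s}$ values for $gg \to H$ at the Tevatron and for $t\bar{t}$ at the LHC. The function names, the toy form $g(x) = x^{a}$ and the integration settings are assumptions made only for this example; this is not the MSTW 2008 gluon, and the scale dependence of $g$ is ignored.

```python
import math

def gg_luminosity(gluon, s_hat, s, n_steps=20000):
    """Return dL_gg/ds_hat = (1/s) * int_tau^1 dx/x g(x) g(tau/x), tau = s_hat/s.

    The integral is done in u = ln(x), where dx/x = du, with the midpoint rule.
    `gluon` is any callable g(x); its scale dependence is ignored here.
    """
    tau = s_hat / s
    u_min = math.log(tau)            # u runs from ln(tau) up to 0
    du = -u_min / n_steps
    total = 0.0
    for j in range(n_steps):
        x = math.exp(u_min + (j + 0.5) * du)
        total += gluon(x) * gluon(tau / x)
    return total * du / s

def toy_gluon(x, a=-1.2):
    """Hypothetical toy shape g(x) = x**a (not a fitted PDF)."""
    return x ** a

def typical_x(mass, sqrt_s):
    """x ~ sqrt(tau) = sqrt(s_hat)/sqrt(s), probed at central rapidity."""
    return mass / sqrt_s

if __name__ == "__main__":
    s = 1960.0 ** 2                  # Tevatron, GeV^2
    s_hat = 180.0 ** 2               # sqrt(s_hat) = M_H = 180 GeV
    tau = s_hat / s
    numeric = gg_luminosity(toy_gluon, s_hat, s)
    # Closed form for g = x**a: the integrand is the constant tau**a.
    exact = -(tau ** -1.2) * math.log(tau) / s
    print(numeric, exact)
    # Higgs at the Tevatron versus the t-tbar threshold at the 7 TeV LHC.
    print(typical_x(180.0, 1960.0), typical_x(2 * 171.3, 7000.0))
```

For a realistic gluon one would replace `toy_gluon` by a call into a PDF library evaluated at $\mu^2 = \hat{s}$; the closed-form comparison is only available for the pure power-law toy shape.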

Figure 13: Fractional contributions of the $gg$-, $gq$- and $qq$-initiated processes to inclusive jet production (with MSTW 2008 NLO PDFs) as a function of $p_T$ [5].

Figure 14: Ratios of data over theory (using MSTW 2008 PDFs) for inclusive jet cross sections at hadron colliders as a function of the scaling variable $x_T = 2p_T/\sqrt{s}$. Plot taken from Ref. [54].

More than 80% of the NLO $t\bar{t}$ cross section comes from the $gg$ channel for the LHC with $\sqrt{s} = 7$ TeV, rising to almost 90% at $\sqrt{s} = 14$ TeV, compared to less than 15% at the Tevatron ($\sqrt{s} = 1.96$ TeV). The significant difference in the initial parton composition for $t\bar{t}$ production is due partly to the lower Tevatron energy ($pp$ collisions at $\sqrt{s} = 1.96$ TeV would give around 50% of the $t\bar{t}$ cross section from the $gg$ channel), but mainly due to the valence-valence nature of the $q\bar{q} \to t\bar{t}$ channel in $p\bar{p}$ collisions. The partonic subprocess is $O(\alpha_S^2)$ at LO. There is therefore a strong dependence on both the gluon distribution (at $x \sim 2m_t/\sqrt{s} \approx 0.05$) and $\alpha_S$. We calculate $t\bar{t}$ production (without decay) for a top-quark pole mass $m_t = 171.3$ GeV (PDG 2009 best value), with a fixed scale choice of $\mu_R = \mu_F = m_t$. We show the $t\bar{t}$ total cross sections at the LHC ($\sqrt{s} = 7$ TeV), plotted as a function of $\alpha_S(M_Z^2)$, in Fig. 12, with 68% C.L. PDF+$\alpha_S$ uncertainties. The NNLO calculation of the total cross section for $t\bar{t}$ production is still in progress, although various approximations based on threshold resummation are available. We use the hathor [49] public code with the default settings for an approximate "NNLO" calculation, although we make no attempt to quantify the theoretical uncertainty (other than from PDFs and $\alpha_S$). A more complete study of the theoretical uncertainties in the approximate NNLO calculation is clearly important, but it is beyond the scope of this work (see Refs. [50, 51, 52] for some recent studies). As a rough indication, the scale uncertainties are estimated to be 13% at NLO and significantly smaller in the approximate NNLO calculations (for example, 8% at NLO+NNLL [52]). The predicted $t\bar{t}$ cross section has a fairly strong dependence on the assumed top-quark mass $m_t$, such that comparison of the measured cross section with theory predictions even allows an extraction of $m_t$. As some indication of the $m_t$ dependence, increasing $m_t$ by 2 GeV to give a value close to the current Tevatron average of $m_t = 173.2 \pm 0.9$ GeV [53] decreases the $t\bar{t}$ cross section at the 7 TeV LHC by about 10 pb (or 6%) at both NLO and NNLO with MSTW08 PDFs. Bearing these caveats in mind, we compare to the single most precise current CMS measurement ($e/\mu$+jets+$b$-tag) of [47]

$$\sigma_{t\bar{t}} = 164.4 \pm 2.8\,(\text{stat.}) \pm 11.9\,(\text{syst.}) \pm 7.4\,(\text{lumi.})\ \text{pb},$$

which is close to a recent CMS combination of $t\bar{t}$ cross-section measurements giving [55]

$$\sigma_{t\bar{t}} = 165.8 \pm 2.2\,(\text{stat.}) \pm 10.6\,(\text{syst.}) \pm 7.8\,(\text{lumi.})\ \text{pb},$$

and to the single most precise current ATLAS measurement (using kinematic information of lepton+jets events) of [48]

$$\sigma_{t\bar{t}} = 179.0 \pm 9.8\,(\text{stat.}+\text{syst.}) \pm 6.6\,(\text{lumi.})\ \text{pb}.$$

The approximate NNLO prediction using MSTW08 PDFs shown in Fig. 12 is consistent with both the ATLAS and CMS measurements, while the central value using ABKM09 is almost 2$\sigma$ below CMS and more than 3$\sigma$ below ATLAS. The discrepancy for ABKM09 would increase further if using the more up-to-date value of $m_t = 173.2 \pm 0.9$ GeV [53] rather than the $m_t = 171.3$ GeV used in the plots, which would reduce all theory predictions by around 6%. The HERAPDF1.0/1.5 sets at NLO are also disfavoured by the LHC data, as is the HERAPDF1.0 NNLO set with $\alpha_S(M_Z^2) = 0.1145$.
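To put a prediction and the quoted $t\bar{t}$ measurements on a common footing, one simple option is to add the quoted uncertainty components in quadrature and express the difference in units of the total. The sketch below does this. The quadrature sum assumes the components are uncorrelated, the helper names are hypothetical, and the theory value in the usage lines is a placeholder rather than a number taken from this paper.

```python
import math

def total_uncertainty(*components):
    """Add uncertainty components in quadrature (assumes no correlation)."""
    return math.sqrt(sum(c * c for c in components))

def pull(theory, measured, *components):
    """(theory - measured) in units of the combined experimental uncertainty."""
    return (theory - measured) / total_uncertainty(*components)

# Measurements quoted in the text (pb).
CMS = (164.4, 2.8, 11.9, 7.4)     # central, stat., syst., lumi.
ATLAS = (179.0, 9.8, 6.6)         # central, stat.+syst., lumi.

if __name__ == "__main__":
    print(round(total_uncertainty(*CMS[1:]), 1))    # 14.3
    print(round(total_uncertainty(*ATLAS[1:]), 1))  # 11.8
    sigma_theory = 150.0   # placeholder prediction in pb, NOT from the paper
    print(pull(sigma_theory, *CMS), pull(sigma_theory, *ATLAS))
```

A pull computed this way ignores the theory uncertainty and the $m_t$ dependence discussed above, so it should be read only as a first orientation.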

It would be a worrying situation if the $t\bar{t}$ cross section at the LHC were needed to discriminate between PDF sets. The measured $t\bar{t}$ cross section is commonly used to constrain new physics contributions, so it is questionable whether it should be used directly to constrain PDFs. Rather, we would hope that the gluon distribution (and $\alpha_S$) would be sufficiently constrained by other data sets that the $t\bar{t}$ cross section is a prediction rather than a direct PDF constraint. If only the ABKM09 PDF set were available, one can only imagine what new physics scenarios could be conjured up to explain the "excess" of data over theory seen in Fig. 12.



Measurements of the scaling violations of DIS structure functions can be used to constrain the small-$x$ gluon, although there is no direct constraint on the large-$x$ gluon from inclusive DIS. To constrain the high-$x$ gluon distribution we need to look for processes where the gluon appears in the initial state at LO, and the best example is inclusive jet production at hadron colliders. In Fig. 13 we show the composition of the initial state as a function of the jet transverse momentum. There is a transition from gluon-gluon and gluon-quark at low $p_T$ values to quark-quark at high $p_T$ values, but even at high $p_T$ values the gluon-quark contribution is still significant. (Here, we do not distinguish between quarks and antiquarks.) Since the quark distributions can be constrained by other data sets, it means that jet production mainly constrains the gluon distribution. The plot in Fig. 14, taken from Ref. [54], shows data plotted as a function of the scaling variable $x_T = 2p_T/\sqrt{s}$, which is related to the $x$ values probed at central rapidity. The current LHC data are generally at lower $x_T$ than at the Tevatron, so presently the best constraint on the high-$x$ gluon comes from the Tevatron jet data. Indeed, other than the $t\bar{t}$ cross section, jets are perhaps the closest observable to Higgs in terms of the partonic luminosity, kinematics and power of coupling; see Fig. 15, taken from Ref. [56]. The crucial question we must address, therefore, is how well do the "non-global" fits describe the Tevatron jet data?

Figure 15: Jets as a discriminator of the high-$x$ gluon distribution (extract from talk by D. de Florian [56]).

We will not consider the less reliable Tevatron Run I data, which prefer a much harder high-$x$ gluon distribution [4], and are obtained using less sophisticated jet algorithms. The three data sets on inclusive jet production from the Tevatron Run II [57, 58, 59] were all found to be compatible [?]. The MSTW 2008 analysis [4] included the CDF Run II inclusive jet data using the $k_T$ jet algorithm [57] and the DØ Run II inclusive jet data using a cone jet algorithm [59]. Consistency was checked with the CDF Run II inclusive jet data using the cone-based Midpoint jet algorithm [58], but this data set was not included in the final MSTW08 fit, since it is essentially the same measurement (using 1.13 fb$^{-1}$) as Ref. [57] (using 1.0 fb$^{-1}$), differing mainly by the choice of jet algorithm. The $k_T$ jet algorithm is theoretically preferred due to its property of infrared safety. We therefore focus here on the CDF Run II inclusive jet data using the $k_T$ jet algorithm [57], and the description using NNLO PDFs, but the other Tevatron Run II jet data sets and the description using NLO PDFs are considered in detail in Ref. [2].

One obvious problem is that the complete NNLO partonic cross section ($\hat{\sigma}$) for inclusive jet production is currently unknown, and needs to be approximated with the NLO $\hat{\sigma}$ supplemented by 2-loop threshold corrections [27]. We calculate jet cross sections using fastnlo [60] (based on nlojet++ [61, 62]), which includes these 2-loop threshold corrections. Following the usual way of estimating theoretical uncertainties due to unknown higher-order corrections, we take different scale choices $\mu_R = \mu_F = \mu = \{p_T/2, p_T, 2p_T\}$ as some indication of the theoretical uncertainty. Smaller scale choices raise the partonic cross section, so favour softer high-$x$ gluon distributions [4], and the central $\mu = p_T$ was chosen for the final MSTW08 fit [4]. More comments on the scale dependence are given in Ref. [2].

NNLO PDF      $\alpha_S(M_Z^2)$   $\mu = p_T/2$   $\mu = p_T$     $\mu = 2p_T$
MSTW08        0.1171              1.39 (+0.35)    0.69 (-0.45)    0.97 (-1.50)
NNPDF2.1      0.1190              0.68 (-0.77)    0.71 (-2.02)    0.71 (-3.46)
HERAPDF1.0    0.1145              2.37 (-2.65)    1.48 (-3.64)    1.29 (-4.12)
HERAPDF1.0    0.1176              2.24 (-0.48)    1.13 (-1.60)    1.09 (-2.23)
HERAPDF1.5    0.1176              1.61 (+1.22)    0.77 (+0.30)    1.06 (-0.39)
ABKM09        0.1135              1.53 (-4.27)    1.23 (-5.05)    1.44 (-5.65)
JR09          0.1124              0.75 (+0.13)    1.26 (-0.61)    2.20 (-1.22)

Table 4: Values of $\chi^2/N_{\rm pts.}$ for the CDF Run II inclusive jet data using the $k_T$ jet algorithm [57] with $N_{\rm pts.} = 76$ and $N_{\rm corr.} = 17$, for different NNLO PDF sets and different scale choices $\mu_R = \mu_F = \mu = \{p_T/2, p_T, 2p_T\}$. No restriction is imposed on the shift in normalisation and the optimal value of "$-r_{\rm lumi.}$" is shown in brackets, where the data points are shifted as $D_i \to D_i(1 - 0.058\,r_{\rm lumi.})$. Values of $|r_{\rm lumi.}| \in [1, 3]$ are shown in italics and values $|r_{\rm lumi.}| > 3$ are shown in bold.

It is important to account for correlated systematic uncertainties of the experimental data points. The full correlated error information is accounted for by using a goodness-of-fit ($\chi^2$) definition given by

$$\chi^2 = \sum_{i=1}^{N_{\rm pts.}} \left(\frac{\hat{D}_i - T_i}{\sigma_i^{\rm uncorr.}}\right)^2 + \sum_{k=1}^{N_{\rm corr.}} r_k^2, \qquad (2)$$

where $T_i$ are the theory predictions and $\hat{D}_i \equiv D_i - \sum_{k=1}^{N_{\rm corr.}} r_k\,\sigma_{k,i}^{\rm corr.}$ are the data points allowed to shift by the systematic errors in order to give the best fit. Here, $i = 1, \ldots, N_{\rm pts.}$ labels the individual data points and $k = 1, \ldots, N_{\rm corr.}$ labels the individual correlated systematic errors. The data points $D_i$ have uncorrelated (statistical and systematic) errors $\sigma_i^{\rm uncorr.}$ and correlated systematic errors $\sigma_{k,i}^{\rm corr.}$. The optimal shifts of the data points by the systematic errors, $r_k$, are solved for analytically by minimising the $\chi^2$ in Eq. (2). There is a clear trade-off between the systematic shifts and the parameters of the gluon distribution. Deficiencies in the theory calculation can be masked to some extent by large systematic shifts, therefore it is important to check that the optimal values are not unreasonable. This is straightforward when using a $\chi^2$ definition like Eq. (2), but is more difficult using an equivalent form

$$\chi^2 = \sum_{i=1}^{N_{\rm pts.}} \sum_{i'=1}^{N_{\rm pts.}} (D_i - T_i)\,\left(V^{-1}\right)_{ii'}\,(D_{i'} - T_{i'}), \qquad (3)$$

written in terms of the inverse of the experimental covariance matrix,

$$V_{ii'} = \delta_{ii'}\left(\sigma_i^{\rm uncorr.}\right)^2 + \sum_{k=1}^{N_{\rm corr.}} \sigma_{k,i}^{\rm corr.}\,\sigma_{k,i'}^{\rm corr.},$$

as used by the ABKM and NNPDF fitting groups. More precisely, NNPDF use a refinement to treat normalisation errors as multiplicative [65], while Alekhin (ABKM) treats all correlated systematic errors as multiplicative [66, 67].
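The analytic minimisation over the systematic shifts, and the equivalence of the shifted-data and covariance-matrix forms of the $\chi^2$, can be checked numerically on a small toy data set. The sketch below is a minimal pure-Python version under stated assumptions: the function names and the four-point toy data are invented for illustration, errors are treated as additive, and the shifts are obtained from the linear system $\sum_{k'} A_{kk'} r_{k'} = B_k$ with $A_{kk'} = \delta_{kk'} + \sum_i \sigma_{k,i}^{\rm corr.}\sigma_{k',i}^{\rm corr.}/(\sigma_i^{\rm uncorr.})^2$ and $B_k = \sum_i \sigma_{k,i}^{\rm corr.}(D_i - T_i)/(\sigma_i^{\rm uncorr.})^2$.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda row: abs(m[row][col]))
        m[col], m[piv] = m[piv], m[col]
        for row in range(col + 1, n):
            f = m[row][col] / m[col][col]
            for c in range(col, n + 1):
                m[row][c] -= f * m[col][c]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][c] * x[c] for c in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x

def optimal_shifts(data, theory, sig_unc, sig_corr):
    """Minimise the shifted-data chi2 analytically: solve A r = B for r_k.

    sig_corr[k][i] is the k-th correlated systematic error on point i.
    """
    n_corr, n_pts = len(sig_corr), len(data)
    a = [[(1.0 if k == k2 else 0.0)
          + sum(sig_corr[k][i] * sig_corr[k2][i] / sig_unc[i] ** 2
                for i in range(n_pts))
          for k2 in range(n_corr)] for k in range(n_corr)]
    b = [sum(sig_corr[k][i] * (data[i] - theory[i]) / sig_unc[i] ** 2
             for i in range(n_pts)) for k in range(n_corr)]
    return solve(a, b)

def chi2_shifted(data, theory, sig_unc, sig_corr, r):
    """Shifted-data residuals plus the penalty term sum_k r_k^2."""
    total = sum(rk * rk for rk in r)
    for i in range(len(data)):
        shifted = data[i] - sum(r[k] * sig_corr[k][i] for k in range(len(r)))
        total += ((shifted - theory[i]) / sig_unc[i]) ** 2
    return total

def chi2_covariance(data, theory, sig_unc, sig_corr):
    """Residuals contracted with the inverse covariance matrix V^{-1}."""
    n = len(data)
    v = [[(sig_unc[i] ** 2 if i == j else 0.0)
          + sum(sc[i] * sc[j] for sc in sig_corr)
          for j in range(n)] for i in range(n)]
    d = [data[i] - theory[i] for i in range(n)]
    w = solve(v, d)                      # w = V^{-1} d
    return sum(d[i] * w[i] for i in range(n))

# Toy inputs (hypothetical numbers, for illustration only).
DATA = [10.0, 8.0, 6.5, 5.0]
THEORY = [9.0, 7.4, 6.0, 4.4]
SIG_UNC = [0.5, 0.4, 0.4, 0.3]
SIG_CORR = [[0.58, 0.46, 0.38, 0.29],     # normalisation-like systematic
            [0.20, 0.10, -0.10, -0.20]]   # shape-like systematic

if __name__ == "__main__":
    r = optimal_shifts(DATA, THEORY, SIG_UNC, SIG_CORR)
    print("shifts r_k:", r)
    print(chi2_shifted(DATA, THEORY, SIG_UNC, SIG_CORR, r),
          chi2_covariance(DATA, THEORY, SIG_UNC, SIG_CORR))
```

The two $\chi^2$ values agree, but only the first form exposes the individual shifts $r_k$, which is what makes it easy to spot unreasonable values.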

In Table 4 we give the $\chi^2$ per data point, calculated using Eq. (2), for the CDF Run II data on inclusive jet production using the $k_T$ jet algorithm [57], for different NNLO PDF sets and different scale choices $\mu_R = \mu_F = \mu = \{p_T/2, p_T, 2p_T\}$, where $p_T$ is the jet transverse momentum. We treat the luminosity uncertainty as any other correlated systematic. However, we find that the relevant systematic shift $r_{\rm lumi.} \sim 3$-5 for some PDF sets with soft high-$x$ gluon distributions (e.g. ABKM09 and HERAPDF1.0), which is clearly completely unreasonable, as it means that the data points are normalised downwards by 3-5 times the nominal luminosity uncertainty (5.8% for CDF). The penalty term $r_{\rm lumi.}^2$ will contribute only 9-25 units to the total $\chi^2$ given by Eq. (2), which can therefore still lead to reasonably low overall $\chi^2$ values.

It is the usual situation at collider experiments that the luminosity determination is common to all cross sections measured from a given data set, so the requirement of a single common luminosity is mandatory when fitting multiple measurements taken during a single running period. All NNLO PDFs are in good agreement with the Tevatron $W$ and $Z$ cross sections. If the Tevatron jet data were normalised downwards by 20-30% (i.e. 3-5 times the luminosity uncertainty), the Tevatron $W$ and $Z$ total cross sections would need to be normalised downwards by the same amount, resulting in complete disagreement with all theory predictions. This example illustrates the utility of simultaneously fitting $W$ and $Z$ cross sections together with jet cross sections at the Tevatron (and LHC). The luminosity shifts, common to both data sets, are effectively determined by the more precise $W$ and $Z$ cross sections. The luminosity uncertainty is then effectively removed from the jet cross sections, thereby allowing the jet data to provide a tighter constraint on the gluon distribution (and $\alpha_S$).

To avoid these completely unrealistic luminosity shifts, $r_{\rm lumi.} \sim 3$-5, without going into the complication of simultaneously including $W$ and $Z$ cross sections in the $\chi^2$ computation, we will calculate the $\chi^2$ values for the Tevatron jet data using Eq. (2), but with the simple restriction that the relevant systematic shift $|r_{\rm lumi.}| < 1$. More practically, this means that if $|r_{\rm lumi.}| > 1$ for any particular PDF set, we fix $r_{\rm lumi.}$ at $\pm 1$ and reevaluate Eq. (2) with the luminosity removed from the list of correlated systematics. The results are given in Table 5.
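The effect of restricting the luminosity shift can be illustrated with a single normalisation systematic. The following sketch is a simplified stand-in for the procedure just described, not a reproduction of it: it assumes the luminosity is the only correlated systematic, treats the 5.8% error as additive with size $0.058\,D_i$, and uses invented toy numbers in which the data lie 25% above the theory.

```python
def lumi_fit(data, theory, sig_unc, lumi_frac=0.058, r_max=1.0):
    """Fit one luminosity shift r, then restrict it to |r| <= r_max.

    Data are shifted as D_i -> D_i * (1 - lumi_frac * r) and the chi2
    includes the penalty r**2. Returns (r_free, chi2_free, r_used, chi2_used).
    """
    corr = [lumi_frac * d for d in data]          # sigma^corr_{lumi,i}
    a = 1.0 + sum((c / s) ** 2 for c, s in zip(corr, sig_unc))
    b = sum(c * (d - t) / s ** 2
            for c, d, t, s in zip(corr, data, theory, sig_unc))

    def chi2(r):
        return r * r + sum(((d - r * c - t) / s) ** 2
                           for c, d, t, s in zip(corr, data, theory, sig_unc))

    r_free = b / a                                # unrestricted optimum
    r_used = max(-r_max, min(r_max, r_free))      # fixed at +-r_max if outside
    return r_free, chi2(r_free), r_used, chi2(r_used)

# Toy example (hypothetical numbers): data 25% above theory, 5% point errors.
TOY_THEORY = [100.0, 50.0, 20.0, 5.0]
TOY_DATA = [1.25 * t for t in TOY_THEORY]
TOY_SIG = [0.05 * d for d in TOY_DATA]

if __name__ == "__main__":
    r_free, chi2_free, r_used, chi2_used = lumi_fit(TOY_DATA, TOY_THEORY, TOY_SIG)
    print("free:       r = %.2f, chi2 = %.1f" % (r_free, chi2_free))
    print("restricted: r = %.2f, chi2 = %.1f" % (r_used, chi2_used))
```

In this toy case the unrestricted fit absorbs most of the disagreement in a shift of nearly three times the luminosity uncertainty, while the restricted fit exposes it as a much larger $\chi^2$.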
NNLO PDF      $\alpha_S(M_Z^2)$   $\mu = p_T/2$   $\mu = p_T$     $\mu = 2p_T$
MSTW08        0.1171              1.39 (+0.35)    0.69 (-0.45)    0.97 (-1.00)
NNPDF2.1      0.1190              0.68 (-0.77)    0.81 (-1.00)    1.29 (-1.00)
HERAPDF1.0    0.1145              2.64 (-1.00)    2.15 (-1.00)    2.20 (-1.00)
HERAPDF1.0    0.1176              2.24 (-0.48)    1.17 (-1.00)    1.23 (-1.00)
HERAPDF1.5    0.1176              1.61 (+1.00)    0.77 (+0.30)    1.06 (-0.39)
ABKM09        0.1135              2.55 (-1.00)    2.76 (-1.00)    3.41 (-1.00)
JR09          0.1124              0.75 (+0.13)    1.26 (-0.61)    2.21 (-1.00)

Table 5: Same as Table 4, but at most a 1$\sigma$ shift in normalisation is allowed. The optimal value of "$-r_{\rm lumi.}$" is shown in brackets, subject to this restriction, where the data points are shifted as $D_i \to D_i(1 - 0.058\,r_{\rm lumi.})$. We highlight in bold those values lying inside the 90% C.L. region, defined by Eq. (4), which gives $\chi^2/N_{\rm pts.} < 0.83$.

We highlight in bold the $\chi^2$ values lying inside the 90% C.L. region defined as

$$\chi^2 < \frac{\chi_0^2}{\xi_{50}}\,\xi_{90}, \qquad (4)$$

where $\xi_{50}$ and $\xi_{90}$ are the 50th and 90th percentiles of the $\chi^2$-distribution with $N_{\rm pts.} = 76$ degrees of freedom. (These quantities are defined in detail in Sect. 6.2 of Ref. [4].) Here, $\chi_0^2$ is defined as the lowest $\chi^2$ value of all theory predictions, i.e. assumed to be close to the best possible fit, so that the rescaling factor $\chi_0^2/\xi_{50}$ in Eq. (4) empirically accounts for any unusual fluctuations preventing the best possible fit having $\chi^2 \approx \xi_{50} \simeq N_{\rm pts.}$ [63]. The 90% C.L. region given in this way is used to determine the PDF uncertainties according to the "dynamical tolerance" prescription introduced in Ref. [4], so PDF sets with $\chi^2$ values far outside this region cannot be considered to give an acceptable description of the data. We see from Table 5 that MSTW08, NNPDF2.1 and HERAPDF1.5 give an acceptable description for $\mu = p_T$, while HERAPDF1.0 (with the lower $\alpha_S$ value) and ABKM09 give $\chi^2/N_{\rm pts.} \sim 2$-3. The JR09 set and the HERAPDF1.0 set with $\alpha_S(M_Z^2) = 0.1176$ give a better description, and give predictions for $gg \to H$ cross sections at the Tevatron which are closer to the MSTW08 predictions than those from ABKM09 and the HERAPDF1.0 NNLO set with $\alpha_S(M_Z^2) = 0.1145$. The same trend is apparent, but to a somewhat lesser extent, for the CDF Run II inclusive jet data using the cone-based Midpoint jet algorithm [58] and the DØ Run II inclusive jet data using a cone jet algorithm [59]; see Ref. [2] for the details and more discussion.
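The 90% C.L. limit quoted for the $\chi^2$ per point can be reproduced approximately from the percentile rescaling described above. The sketch below uses the Wilson-Hilferty approximation for the $\chi^2$ percentiles so that only the Python standard library is needed; this approximation and the function names are choices made for this example, and an exact quantile function from a statistics library could be substituted.

```python
from statistics import NormalDist

def chi2_percentile(p, ndf):
    """Wilson-Hilferty approximation to the p-quantile of chi^2 with ndf d.o.f."""
    z = NormalDist().inv_cdf(p)
    h = 2.0 / (9.0 * ndf)
    return ndf * (1.0 - h + z * h ** 0.5) ** 3

def cl90_limit_per_point(chi2_0_per_point, n_pts):
    """Upper limit on chi2/N_pts from chi2 < (chi2_0 / xi_50) * xi_90."""
    xi50 = chi2_percentile(0.50, n_pts)
    xi90 = chi2_percentile(0.90, n_pts)
    return chi2_0_per_point * xi90 / xi50

if __name__ == "__main__":
    # Lowest chi2/N_pts in Table 5 is 0.68, with N_pts = 76.
    print(round(cl90_limit_per_point(0.68, 76), 2))  # 0.83
```

With the lowest entry of Table 5 as $\chi_0^2$, the limit comes out at about 0.83 per point, matching the value quoted in the table caption.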

To summarise, comparison with Tevatron jet data is subtle because of the large correlated systematic uncertainties. The systematic shifts can compensate for inadequacies in the theory calculation. The traditional $\chi^2$ definition in terms of the experimental covariance matrix, Eq. (3), can hide such systematic shifts. In particular, we find that the Tevatron jet data need to be normalised downwards by typically between 3$\sigma$ and 5$\sigma$ (see Table 4) to achieve the best agreement with some PDF sets, particularly the ABKM09 predictions. Even if the luminosity shift is artificially constrained, the other systematic shifts move by large amounts for the inclusive jet data, incompatible with the Gaussian expectation. No such problems are observed for the MSTW08 predictions. It can also be seen from the plots in Ref. [68] that the unshifted Tevatron jet data lie significantly above the theory predictions even after including these data in variants of the ABKM09 fit. Constraining the Tevatron luminosity shifts, for example, so that the predicted $W$ and $Z$ cross sections agreed with Tevatron data, would increase the constraining power of the Tevatron jet data and thereby very likely give a larger $\alpha_S$ and high-$x$ gluon distribution than the current ABM studies [68]. Even with the existing treatment, the NNLO Tevatron $gg \to H$ cross section for $M_H = 165$ GeV goes up by 15% when including the CDF Run II ($k_T$ jet algorithm) [57] data set in a variant of the ABKM09 fit [68].



6. Strong coupling $\alpha_S$ from DIS

A recent claim has been made [46] that the bulk of the MSTW08/ABKM09 difference in both the extracted $\alpha_S(M_Z^2)$ value and the $gg \to H$ predictions is explained by the treatment of NMC data [69]. The differential cross section for DIS of charged leptons off nucleons, $\ell N \to \ell X$, neglecting the nucleon and lepton masses, and assuming single-photon exchange, is

$$\frac{d^2\sigma}{dx\,dQ^2} \simeq \frac{4\pi\alpha^2}{x\,Q^4} \left[ 1 - y + \frac{y^2/2}{1 + R(x, Q^2)} \right] F_2(x, Q^2), \qquad (5)$$



where $R = \sigma_L/\sigma_T \simeq F_L/(F_2 - F_L)$ is the ratio of the $\gamma^* N$ cross sections for longitudinally and transversely polarised photons, $Q^2$ is the photon virtuality, $x$ is the Bjorken variable and $y \simeq Q^2/(xs)$ is the inelasticity (with $\sqrt{s}$ the $\ell N$ centre-of-mass energy). The ABKM09 [16] analysis fitted the NMC differential cross sections directly, calculating $F_L$ to $O(\alpha_S^2)$ and including empirical higher-twist corrections. The MSTW08 [4] analysis instead fitted the NMC $F_2$ values corrected for $R$, where $R(x, Q^2) = R_{\rm NMC}(x)$ if $x < 0.12$ or $R(x, Q^2) = R_{1990}(x, Q^2)$ if $x > 0.12$ [69]. Here, $R_{\rm NMC}(x)$ was a ($Q^2$-independent) value extracted from NMC data, while $R_{1990}(x, Q^2)$ was a $Q^2$-dependent empirical parameterisation of SLAC data dating from 1990 [70]. By replacing the NMC differential cross-section data by NMC $F_2$ data, ABM [46] found that their best-fit $\alpha_S(M_Z^2)$ moved from 0.1135 to 0.1170 and their $gg \to H$ cross sections at the Tevatron and LHC moved closer to the MSTW08 values. ABM [46] therefore concluded that the use of NMC $F_2$ data in the MSTW08 fit rather than the differential cross section is the main reason for the higher $\alpha_S(M_Z^2)$ and Higgs cross sections obtained with MSTW08.

ABKM09                          MSTW08
Fit NMC cross section           Fit NMC $F_2$
Fit $Q^2 > 2.5$ GeV$^2$         Fit $Q^2 > 2$ GeV$^2$
Fit $W^2 > 3.24$ GeV$^2$        Fit $W^2 > 15$ GeV$^2$
Fit higher-twist                Neglect higher-twist
Separated energies $E_\mu$      Averaged energies $E_\mu$
Correlated systematics          Neglect correlations
3 gluon parameters              7 gluon parameters
No jet data                     Tevatron jet data

Table 6: Main differences in the treatment of NMC data by the ABKM09 [16] and MSTW08 [4] fits.

It is certainly more consistent to fit directly to the NMC differential cross-section data, and so this rather dramatic assertion made by ABM [46] would obviously be very worrying if correct. However, there are many other differences between the two analyses besides the treatment of $F_L$ for the NMC data, with some relevant differences given in Table 6, where the last row (inclusion of jet data) is highlighted as being the most important. Nevertheless, we carried out a detailed investigation of the sensitivity to NMC data in Ref. [2]. Rather than repeat the MSTW08 analysis by fitting the NMC differential cross sections, we noted that the original NMC paper [69] made an alternative extraction of $F_2$ values using the SLAC $R_{1990}$ parameterisation [70]. We observed that the MSTW08 NNLO prediction, with $F_L$ calculated to $O(\alpha_S^3)$ and without any higher-twist corrections, gives a good description of the SLAC $R_{1990}$ parameterisation, demonstrating that fitting the alternative NMC $F_2$ data extracted using the SLAC $R_{1990}$ parameterisation will give very similar results to fitting the NMC differential cross sections. In Table 7 we show the effect of repeating the MSTW08 NNLO fit with the NMC $F_2$ data extracted using $R_{1990}$ on $\alpha_S(M_Z^2)$ and the Higgs cross sections (for $M_H = 165$ GeV) at the Tevatron and LHC, and in Fig. 16 we show the change in the gluon distribution at the corresponding scale. We make other fits either cutting the NMC $F_2$ data for $x < 0.1$, above which the $R$ correction in Eq. (5) is very small indeed, or completely removing all NMC $F_2$ data. In all cases there is very little change in $\alpha_S(M_Z^2)$, the gluon distribution, and the Higgs cross section. We conclude that the treatment of NMC data cannot explain the difference between the MSTW08 and ABKM09 results. Similar stability has been found by the NNPDF group [71], but in a somewhat less relevant study with fixed $\alpha_S$.
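The statement that the $R$ correction matters only at high inelasticity follows directly from the bracket $1 - y + (y^2/2)/(1 + R)$ in the DIS cross section. The short sketch below evaluates how much the $F_2$ extracted from a fixed measured cross section changes when $R$ is changed; the function names and the values $R = 0.2$ and $R = 0.3$ are illustrative assumptions, not values taken from the NMC or SLAC parameterisations.

```python
def y_factor(y, r):
    """Kinematic factor multiplying F_2: 1 - y + (y^2/2) / (1 + R)."""
    return 1.0 - y + 0.5 * y * y / (1.0 + r)

def f2_ratio(y, r_new, r_old):
    """F_2(R = r_new) / F_2(R = r_old) extracted from the same cross section."""
    return y_factor(y, r_old) / y_factor(y, r_new)

if __name__ == "__main__":
    for y in (0.1, 0.4, 0.8):
        change = 100.0 * (f2_ratio(y, 0.3, 0.2) - 1.0)
        print("y = %.1f: F_2 changes by %.2f%%" % (y, change))
```

Since $y \simeq Q^2/(xs)$, points at larger $x$ (for fixed $Q^2$ and beam energy) sit at smaller $y$, where the extracted $F_2$ is almost independent of the assumed $R$.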

The cuts on DIS data are not explicitly given in the ABKM09 paper [16], but the previous AMP06 paper [72] mentions that DIS data are removed with $Q^2 < 2.5$ GeV$^2$ and $W^2 < (1.8\ {\rm GeV})^2 = 3.24$ GeV$^2$, compared to the MSTW08 fit which removes DIS data with $Q^2 < 2$ GeV$^2$ and $W^2 < 15$ GeV$^2$. The much weaker cut on the hadronic invariant mass (squared), $W^2 \simeq Q^2(1/x - 1)$, clearly explains why higher-twist corrections are more important in the ABKM09 analysis. To investigate the possible effect of neglected higher-twist corrections on the MSTW08 NNLO fit we raised the cuts to remove DIS data with $W^2 < 20$ GeV$^2$ and either $Q^2 < 5$ GeV$^2$ or $Q^2 < 10$ GeV$^2$. The results are shown in Table 7 and Fig. 16. The changes in $\alpha_S$, the gluon distribution and the Higgs cross sections are generally small and within uncertainties, although with the strongest $Q^2$ cut there is no data constraint below $x = 10^{-4}$ and little just above, so the PDFs differ but have large uncertainties at low $x$ values.
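The different kinematic cuts can be compared on a few sample points using $W^2 \simeq Q^2(1/x - 1)$. The sketch below is illustrative only: the labels, the sample points and the convention that a point is kept when it lies at or above both thresholds are assumptions of this example, with the threshold values taken from the text.

```python
def w2(x, q2):
    """Hadronic invariant mass squared, W^2 ~ Q^2 (1/x - 1), masses neglected."""
    return q2 * (1.0 / x - 1.0)

def passes(x, q2, q2_min, w2_min):
    """Keep a DIS point only if it lies at or above both the Q^2 and W^2 cuts."""
    return q2 >= q2_min and w2(x, q2) >= w2_min

# (Q^2_min, W^2_min) in GeV^2; labels are shorthand for this example.
CUTS = {
    "ABKM09-like": (2.5, 3.24),
    "MSTW08 default": (2.0, 15.0),
    "MSTW08 raised": (5.0, 20.0),
    "MSTW08 raised further": (10.0, 20.0),
}

# Hypothetical sample points (x, Q^2 in GeV^2).
POINTS = [(0.5, 5.0), (0.1, 3.0), (0.65, 20.0), (0.01, 12.0)]

if __name__ == "__main__":
    for x, q2 in POINTS:
        kept = [name for name, (q2_min, w2_min) in CUTS.items()
                if passes(x, q2, q2_min, w2_min)]
        print("x = %.2f, Q2 = %.1f, W2 = %.1f: kept by %s"
              % (x, q2, w2(x, q2), kept))
```

High-$x$, moderate-$Q^2$ points such as $(x, Q^2) = (0.5, 5\ {\rm GeV}^2)$ have $W^2 = 5$ GeV$^2$, so they survive the weaker $W^2$ cut but are removed by the MSTW08 one; this is the region where higher-twist corrections are expected to matter most.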

In Table 7 and Fig. 16 we show the results of the MSTW08 NNLO fit with a fixed $\alpha_S(M_Z^2) = 0.113$ (slightly below the ABKM09 value), and even in this case the gluon distribution and Higgs cross sections move only part of the way towards the ABKM09 result, as already seen in Fig. 10. The MSTW08 NNLO input gluon parameterisation [4] has 7 free parameters compared to only 3 for ABKM09, only 2 for JR09 and HERAPDF1.0 (although the value of $Q_0^2$ is optimised in the case of JR09), and only 4 for HERAPDF1.5 NNLO. In the absence of any direct data constraint on the high-$x$ gluon distribution, the other fits are therefore constrained by the form of the input parameterisation, avoiding potential pathological behaviour such as a negative high-$x$ gluon distribution. In an attempt to mimic the ABKM09 fit we performed a variant of the MSTW08 NNLO fit without jet data and with the input gluon distribution forced to be positive. The results of this fit are shown in Table 7 and Fig. 16, and it goes some way towards reproducing the high-$x$ gluon of the ABKM09 fit and the corresponding Tevatron $gg \to H$ prediction, certainly closer than we come with other modifications.

NNLO PDF                                     $\alpha_S(M_Z^2)$   $\sigma_H$ at Tevatron   $\sigma_H$ at 7 TeV LHC
MSTW08                                       0.1171              0.342 pb                 7.91 pb
Use $R_{1990}$ for NMC $F_2$                 0.1167              -0.7%                    -0.9%
Cut NMC $F_2$ ($x < 0.1$)                    0.1162              -1.2%                    -2.1%
Cut all NMC $F_2$ data                       0.1158              -0.7%                    -2.1%
Cut $Q^2 < 5$ GeV$^2$, $W^2 < 20$ GeV$^2$    0.1171              -1.2%                    +0.4%
Cut $Q^2 < 10$ GeV$^2$, $W^2 < 20$ GeV$^2$   0.1164              -3.0%                    -1.7%
Fix $\alpha_S(M_Z^2)$                        0.1130              -11%                     -7.6%
Input $xg > 0$, no jets                      0.1139              -17%                     -4.9%
ABKM09                                       0.1135              -26%                     -11%

Table 7: Effect of NMC treatment on $\alpha_S(M_Z^2)$ and Higgs cross sections ($M_H = 165$ GeV). We also show the effect of raising the cuts imposed on the DIS data compared to the default of removing data with $Q^2 < 2$ GeV$^2$ and $W^2 < 15$ GeV$^2$. Finally, we show the effect of simply fixing $\alpha_S(M_Z^2)$ to be close to the ABKM09 value, or performing a fit with a positive-definite input gluon distribution and no jet data, and we compare directly to ABKM09. Table taken from Ref. [?].

Figure 16: Effect of NMC treatment on the gluon distribution at a scale $Q^2 = (165\ {\rm GeV})^2$. The values of $x = M_H/\sqrt{s}$ relevant for central production (assuming $p_T^H = 0$) of a SM Higgs boson of mass $M_H = 165$ GeV at the Tevatron and LHC are indicated. We also show the effect of raising the cuts imposed on the DIS data compared to the default of removing data with $Q^2 < 2$ GeV$^2$ and $W^2 < 15$ GeV$^2$. Finally, we show the effect of simply fixing $\alpha_S(M_Z^2)$ to be close to the ABKM09 value, or performing a fit with a positive-definite input gluon distribution and no jet data, and we compare directly to ABKM09. Plot taken from Ref. [2].

Other differences between the two analyses are that ABKM09 used the NMC data for separate muon beam energies, whereas MSTW08 used the NMC data averaged over beam energies, which reduces the maximum effect of the change in $R$ for a particular data point: at a given $x$ and $Q^2$, a data point that is at high $y$ (and so very sensitive to $R$) at a low beam energy is at lower $y$ for a higher beam energy. In the case of the averaged NMC data, correlated systematic uncertainties are unavailable, so the MSTW08 fit simply added errors (other than normalisation) in quadrature. As with the Tevatron jet data, deficiencies in the theory calculation may be hidden, without much trace, by large systematic shifts implicit in the $\chi^2$ definition, Eq. (3), similar to that used in the ABKM09 analysis. We conclude that the greater sensitivity to the treatment of NMC data found by ABM [46] is due to a variety of reasons, but perhaps most significantly, the inclusion of higher-twist corrections due to the weaker cuts on DIS data, and, as we have repeatedly emphasised, the lack of additional constraints provided by the Tevatron jet data to pin down the high-$x$ gluon distribution.

Note from Figs. 9, 10, 11 and 12 that the HERAPDF1.5 NNLO prediction has a very large uncertainty in the upwards direction. For example, the uncertainties on the $t\bar{t}$ cross section can be broken down as

$$\sigma_{t\bar{t}} = 178\,{}^{+\ldots}_{-\ldots}\,(\text{exp.})\,{}^{+\ldots}_{-\ldots}\,(\text{model})\,{}^{+\ldots}_{-\ldots}\,(\text{param.}) \pm 6\,(\alpha_S)\ \text{pb}.$$

The dominant model uncertainty comes from varying the minimum $Q^2$ cut from the default value of $Q^2_{\min} = 3.5$ GeV$^2$. Lowering the cut to $Q^2_{\min} = 2.5$ GeV$^2$ increases $\sigma_{t\bar{t}}$ by 9 pb, while raising the cut to $Q^2_{\min} = 5$ GeV$^2$ increases $\sigma_{t\bar{t}}$ by 35 pb, almost all of the total uncertainty on $\sigma_{t\bar{t}}$ in the upwards direction. It is maybe worrying that the NLO version does not exhibit the same sensitivity, perhaps because of the more restrictive parameterisation at NLO (only 2 gluon parameters) than at NNLO (4 gluon parameters). As shown in Fig. 16, the high-$x$ gluon in the MSTW08 fit is relatively insensitive to raising $Q^2_{\min} = 2 \to \{5, 10\}$ GeV$^2$, primarily because the Tevatron jet data stabilise the fit and so lessen sensitivity to the fine details of the treatment of the DIS data.

There is a common lore (see, for example, Ref. [73]) that DIS-only fits prefer low $\alpha_S(M_Z^2)$ values, but Ref. [5] showed that not all DIS data sets prefer low $\alpha_S(M_Z^2)$ values. In particular, this was found to be true only for BCDMS data, and for E665 and SLAC $ep$ data, while NMC, SLAC $ed$ and HERA data preferred high $\alpha_S(M_Z^2)$ values within the context of the global fit [5]. See also the recent NNPDF studies using an "unbiased" PDF parameterisation [30, 31]. It is well known that $\alpha_S$ is highly anticorrelated with the low-$x$ gluon distribution through scaling violations of HERA data: $\partial F_2/\partial \ln(Q^2) \sim \alpha_S\, g$. Then $\alpha_S$ is correlated with the high-$x$ gluon distribution through the momentum sum rule; see, for example, Fig. 14(b) of Ref. [5]. Restrictive gluon parameterisations, without the negative small-$x$ term allowed by MSTW [4], can therefore bias the extracted $\alpha_S$ value. For example, the default MSTW08 NNLO fit obtained $\alpha_S(M_Z^2) = 0.1171 \pm 0.0014$, while imposing the restriction of a positive input gluon at $Q_0^2 = 1$ GeV$^2$ gave a best-fit $\alpha_S(M_Z^2) = 0.1157$, but with a $\chi^2$ worse by 63 units for the global fit to 2615 data points.

What is a s from only DIS data in the MSTW08 
NNLO fit? Recall that the global fit gave a s (M z ) = 

0. 1171 ± 0.0014 |5|. To expand on the studies made 
in Ref. |5|, we performed a new NNLO DIS-only fit, 
which gave a best-fit as(M z ) = 0.1104, but with an in- 
put gluon distribution which went negative for x > 0.4 
due to lack of any data constraint. This implies a nega- 
tive charm structure function, F = harm , and a terrible de- 
scription (x 2 /N pts . ~ 10 including correlated systematic 
errors) of Tevatron jet data using the obtained PDFs. 
A DIS-only fit fixing the high-x gluon parameters to 
prevent such bad behaviour gave a s (M 2 ) = 0.1172, 

1. e. very similar to the global fit. However, a NNLO 
fit which imposed the condition of the positive low-x 
gluon, which stopped the gluon from going negative 
at high x values, and which also omitted the Tevatron 
jet data, gave $\alpha_S(M_Z^2) = 0.1139$, rather closer to the
ABKM09 value. The very low value of $\alpha_S(M_Z^2) = 0.1104$
found in the DIS-only fit is due to the dominance
of BCDMS data. We can show this explicitly by
removing the BCDMS data from the DIS-only fit:
the best-fit $\alpha_S(M_Z^2)$ then moves from 0.1104 to 0.1193.
Repeating the global fit with BCDMS data removed gives
$\alpha_S(M_Z^2) = 0.1181$, i.e. a change by less than the quoted
experimental uncertainty of $\pm 0.0014$. The conclusion is
that the Tevatron jet data are vital to pin down the high-x
gluon, giving a smaller low-x gluon and therefore a
larger $\alpha_S$ in the global fit compared to a DIS-only fit,
at the expense of some deterioration in the fit quality of
the BCDMS data. The benefits of including the Tevatron
jet data to obtain sensible results in a simultaneous
fit of PDFs and $\alpha_S$ therefore greatly outweigh any
disadvantage such as lack of complete NNLO corrections.
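
To put the fit variants quoted above on a common scale, the short sketch below expresses each best-fit value as a shift from the global-fit value, in units of the quoted experimental uncertainty. The numbers are those given in the text; the dictionary labels and the helper name `shift_in_sigma` are illustrative choices, and dividing by the global-fit uncertainty alone is a simplifying assumption, since the variant fits have uncertainties of their own that are not quoted here.

```python
# alpha_S(M_Z^2) from the default MSTW08 NNLO global fit, as quoted in the text.
GLOBAL_VALUE = 0.1171
GLOBAL_SIGMA = 0.0014  # quoted experimental uncertainty

# Best-fit values of the variant fits quoted in the text (labels are ours).
variants = {
    "DIS-only": 0.1104,
    "DIS-only, high-x gluon fixed": 0.1172,
    "positive gluon, no Tevatron jets": 0.1139,
    "DIS-only, BCDMS removed": 0.1193,
    "global, BCDMS removed": 0.1181,
}


def shift_in_sigma(value, reference=GLOBAL_VALUE, sigma=GLOBAL_SIGMA):
    """Shift of a best-fit value from the reference, in units of sigma."""
    return (value - reference) / sigma


shifts = {name: shift_in_sigma(value) for name, value in variants.items()}

for name, shift in shifts.items():
    print(f"{name:34s} {variants[name]:.4f}  {shift:+.2f} sigma")
```

On this scale the unconstrained DIS-only fit lies several units below the global fit, whereas removing BCDMS data from the global fit moves the result by less than one unit.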

The only input DIS value to the current world average
$\alpha_S(M_Z^2)$ [28, 29] is the BBG06 value [74], which is from
a non-singlet analysis and therefore in principle free of 
assumptions made about the gluon distribution. A value 
of 

$\alpha_S(M_Z^2) = \{0.1148,\ 0.1134,\ 0.1141\}$

was obtained at {NLO, NNLO, N$^3$LO}, by fitting proton
and deuteron structure functions, $F_2^p$ and $F_2^d$, for x > 0.3
(assuming only valence quarks, neglecting the singlet
contribution), and the less precise $F_2^{\rm NS} = 2(F_2^p - F_2^d)$
for x < 0.3. However, using the MSTW08 NNLO
central fit, contributions other than valence quarks are
found to make up about 10% (2%) of $F_2^p$ at x = 0.3
(x = 0.5). As an exercise we performed the MSTW08
NNLO DIS-only fit just to $F_2^p$ and $F_2^d$ for x > 0.3
(comprising 282 data points, 160 of these from BCDMS),
which gave $\alpha_S(M_Z^2) = 0.1103$ (0.1130) without (with)
the singlet contribution included. This is even lower
than the BBG06 value, presumably due to lack of the
y > 0.3 cut on BCDMS data applied in the BBG06 analysis.
The low value of $\alpha_S(M_Z^2)$ found by BBG06 [74] is
therefore due both to the dominance of BCDMS data and
to what we conclude is the unjustified neglect of the
singlet contribution to $F_2^p$ and $F_2^d$ for x > 0.3. Given
that it was argued above that the Tevatron jet data are 
needed to pin down the high-x gluon, we conclude that 
an extraction of $\alpha_S(M_Z^2)$ only from inclusive DIS data
is not meaningful, and the closest possible to a reliable
extraction is the MSTW08 NNLO combined analysis of
DIS, Drell-Yan and jet data [5],

$\alpha_S(M_Z^2) = 0.1171 \pm 0.0014$ (exp.) $\pm 0.002$ (th.),

or, alternatively, the more recent NNLO determination 
by the NNPDF Collaboration,

$\alpha_S(M_Z^2) = 0.1173 \pm 0.0007$ (exp.) $\pm 0.0009$ (th.).

These values are the only NNLO determinations, from 
a simultaneous fit with PDFs, which are in agreement 
with the current world average $\alpha_S(M_Z^2) = 0.1184 \pm 0.0007$
[28, 29]; see Fig. 9.
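
The level of agreement with the world average can be illustrated with a simple pull calculation. This is a sketch under stated assumptions rather than the procedure used for the world average: the experimental and theoretical uncertainties are added in quadrature, and each determination is treated as uncorrelated with the world average. The values are those quoted in the text; the function names are hypothetical.

```python
import math


def quadrature_sum(*uncertainties):
    """Combine independent uncertainties in quadrature (an assumption)."""
    return math.sqrt(sum(u * u for u in uncertainties))


def pull(value, uncertainty, ref_value, ref_uncertainty):
    """Difference from the reference in units of the combined uncertainty."""
    return (value - ref_value) / quadrature_sum(uncertainty, ref_uncertainty)


# World average quoted in the text: (central value, uncertainty).
WORLD_AVERAGE = (0.1184, 0.0007)

# (central value, experimental uncertainty, theory uncertainty) from the text.
determinations = {
    "MSTW08 NNLO": (0.1171, 0.0014, 0.002),
    "NNPDF NNLO": (0.1173, 0.0007, 0.0009),
}

pulls = {
    name: pull(value, quadrature_sum(exp, th), *WORLD_AVERAGE)
    for name, (value, exp, th) in determinations.items()
}

for name, p in pulls.items():
    print(f"{name}: {p:+.2f} sigma from the world average")
```

With these assumptions both determinations lie within one combined standard deviation of the world average.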

With all these problems in $\alpha_S$ determinations from
non-global PDF fits, it is therefore disconcerting that the
2011 update of the world average $\alpha_S$ by S. Bethke [75],
intended for the PDG 2012 review, has chosen to treat
the MSTW08, ABKM09, JR09 and BBG06 determinations
on an equal footing in forming a "DIS" value of
$\alpha_S(M_Z^2) = 0.1148 \pm 0.0024$, which lies significantly
below other classes of measurements and the overall
preliminary world average of $\alpha_S(M_Z^2) = 0.1185 \pm 0.0008$.

7. Summary 

The "MSTW 2008" determination of parton distribution
functions (PDFs) [4, 5, 6] is still fairly current with
no immediate update planned. Another update may be
appropriate after the final HERA II combined data
(including $F_2^{\rm charm}$) are published, together with making the
first use of input data from the LHC, such as differential
$Z/\gamma^*$, W, jet and W+charm distributions. However,
provided that the new data are reasonably consistent
with the old data, we would not expect to see
substantial changes to the PDFs. On the other hand, 
perhaps a more pressing concern is the compatibility 
of existing PDFs from different groups with each other. 
We have updated a recently published study of benchmark
cross sections [1, 2] to include results obtained
using the NNPDF2.1 NNLO and HERAPDF1.5
NNLO PDF sets, and to compare to LHC data on
W, Z and $t\bar{t}$ production. Supplementary plots can be
obtained from a public webpage. There is now reasonably
good agreement between the NLO global fits
from MSTW08, CT10 and NNPDF2.1, all using vari- 
ants of a GM-VFNS to treat DIS structure functions, 
together with the NNLO global fits from MSTW08 and 
NNPDF2.1. More variation is seen with other PDF sets 
using more limited data sets and/or restrictive input PDF 
parameterisations. The latest HERAPDF1.5 NNLO set 
is surprisingly close to MSTW08, at least for the central 
value, unlike the analogous HERAPDF1.5 NLO set, but 
it has a large uncertainty in the high-x gluon distribution
due to variation of the minimum $Q^2$ cut, not seen
at NLO perhaps due to the more restricted parameterisation.
The Tevatron jet data are important to pin down
the high-x gluon, with an indirect effect on the value
of $\alpha_S(M_Z^2)$ extracted, reducing sensitivity to the fine
details of the treatment of DIS data. The LHC measurements
of top-pair production tend to favour PDF sets
where the high-x gluon is determined using Tevatron jet
data. Conversely, both the LHC $t\bar{t}$ cross sections and the
Tevatron jet data strongly disfavour PDF sets with soft
high-x gluon distributions and low $\alpha_S$ values
(specifically, ABKM09 [16]), which give anomalously low
Higgs cross sections at both the Tevatron and LHC. We
therefore caution against the use of PDF sets obtained
from "non-global" fits where the high-x gluon distribution
is generally constrained by assumptions made on
the form of parameterisation in the absence of a direct
data constraint.